Add evaluation script for knowledgeqa #43

Merged
amrit110 merged 21 commits into main from ak/add_evaluate_script on Feb 17, 2026
Conversation

@amrit110 (Member) commented Feb 10, 2026

Summary

Clickup Ticket(s): Link(s) if applicable.

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📝 Documentation update
  • 🔧 Refactoring (no functional changes)
  • ⚡ Performance improvement
  • 🧪 Test improvements
  • 🔒 Security fix

Changes Made

  • Add evaluation script for the knowledge QA use case
  • Refactor web tool to use a helper function
  • Add grader for deepsearchqa

Testing

  • Tests pass locally (uv run pytest tests/)
  • Type checking passes (uv run mypy <src_dir>)
  • Linting passes (uv run ruff check src_dir/)
  • Manual testing performed (describe below)

Manual testing details:

  • Ran evaluate.py and checked Langfuse to verify the evals and traces.

Screenshots/Recordings

Related Issues

Deployment Notes

Checklist

  • Code follows the project's style guidelines
  • Self-review of code completed
  • Documentation updated (if applicable)
  • No sensitive information (API keys, credentials) exposed

@amrit110 amrit110 changed the title from "Ak/add evaluate script" to "Add evaluation script for knowledgeqa" Feb 10, 2026
@amrit110 (Member, Author) commented

@fcogidi, I will have to refactor my judges code after your factory PR is merged.

@amrit110 amrit110 requested review from fcogidi and lotif February 10, 2026 13:49
@amrit110 amrit110 mentioned this pull request Feb 11, 2026
16 tasks
@amrit110 amrit110 requested a review from lotif February 13, 2026 21:30
@amrit110 amrit110 added the enhancement (New feature or request) and refactor (Refactor or clean up code structure) labels Feb 17, 2026
@amrit110 amrit110 self-assigned this Feb 17, 2026
@lotif (Collaborator) left a comment

Way better than before :)

@amrit110 amrit110 merged commit e522f6d into main Feb 17, 2026
3 checks passed
@amrit110 amrit110 deleted the ak/add_evaluate_script branch February 17, 2026 17:17